Matching experiments across species using expression values and textual information

Authors

  • Aaron Wise
  • Zoltán N. Oltvai
  • Ziv Bar-Joseph
Abstract

MOTIVATION: With the vast increase in the number of gene expression datasets deposited in public databases, novel techniques are required to analyze and mine this wealth of data. Similar to the way BLAST enables cross-species comparison of sequence data, tools that enable cross-species expression comparison will allow us to better utilize these datasets: cross-species expression comparison enables us to address questions in evolution and development, and further allows the identification of disease-related genes and pathways that play similar roles in humans and model organisms. Unlike sequence, which is static, expression data changes over time and under different conditions. Thus, a prerequisite for performing cross-species analysis is the ability to match experiments across species.

RESULTS: To enable better cross-species comparisons, we developed methods for automatically identifying pairs of similar expression datasets across species. Our method uses a co-training algorithm to combine a model of expression similarity with a model of the text that accompanies the expression experiments. The co-training method outperforms previous methods based on expression similarity alone. Using expert analysis, we show that the new matches identified by our method indeed capture biological similarities across species. We then use the matched expression pairs between human and mouse to recover known and novel cycling genes as well as to identify genes with possible involvement in diabetes. By providing the ability to identify novel candidate genes in model organisms, our method opens the door to new models for studying diseases.

AVAILABILITY: Source code and supplementary information are available at: www.andrew.cmu.edu/user/aaronwis/cotrain12.
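The co-training idea named in the abstract can be illustrated with a small sketch. This is a generic co-training loop, not the authors' implementation: the nearest-centroid classifier, the function names (`fit`, `score`, `co_train`), and the toy feature vectors are all assumptions made for illustration. Each candidate pair of experiments has two views, one hypothetical expression-similarity vector and one hypothetical text-similarity vector. In every round each view's model labels the unlabeled candidate it is most confident about, and that label is added to the shared pool used to retrain both models.

```python
import math

def centroid(rows):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def fit(view, labels):
    """Nearest-centroid model for one view: one centroid per class (0 or 1).
    Assumes the labeled pool contains at least one example of each class."""
    return {c: centroid([view[i] for i, y in labels.items() if y == c])
            for c in (0, 1)}

def score(model, x):
    """Positive when x is closer to the class-1 centroid; magnitude = confidence."""
    return math.dist(x, model[0]) - math.dist(x, model[1])

def co_train(expr_view, text_view, seed_labels, rounds=10):
    """Generic co-training: each view labels its most confident unlabeled item."""
    labels = dict(seed_labels)
    for _ in range(rounds):
        unlabeled = [i for i in range(len(expr_view)) if i not in labels]
        if not unlabeled:
            break
        new = {}
        for view in (expr_view, text_view):
            model = fit(view, labels)
            best = max(unlabeled, key=lambda i: abs(score(model, view[i])))
            # If both views pick the same item, the first view's label is kept.
            new.setdefault(best, int(score(model, view[best]) > 0))
        labels.update(new)
    return labels

# Toy data (hypothetical): six candidate experiment pairs, two features per view.
expr_view = [[0.9, 0.8], [0.1, 0.2], [0.8, 0.9], [0.2, 0.1], [0.7, 0.7], [0.3, 0.2]]
text_view = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9], [0.8, 0.3], [0.2, 0.7]]
seed = {0: 1, 1: 0}  # 1 = matching pair, 0 = non-matching pair

labels = co_train(expr_view, text_view, seed)
print(sorted(labels.items()))
```

The design point is that the two views are trained separately but share a growing label pool, so a confident text-based decision can help the expression model and vice versa. A real system would replace the toy classifier and features with models suited to expression profiles and experiment descriptions.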

Similar resources

Comparison of interestingness measures applied to textual taxonomies matching

This paper presents an experimental comparison of Interestingness Measures (IMs) in the context of an approach designed for matching textual taxonomies. This extensional and asymmetric approach uses an association rule model to match entities from two textual hierarchies. We select 6 IMs and perform two experiments on a benchmark composed of two textual taxonomies and a set ...

Ontologies improve cross-species phenotype analysis

As phenotype data analysis has become an important component of functional genomics, many methods for analyzing these data have been published in the recent past. For example, RNA interference (RNAi) in mice has significantly improved our understanding of gene regulation, even for human disease. However, as phenotypes are obtained through species-specific experiments, they are usually described...

Machine Learning in studying gene regulation

Researchers have been increasingly relying on cross-species analysis to understand how biological systems operate. Sequence-based methods have been successfully applied to identify and characterize coding and functional non-coding regions in multiple species [5]. However, sequence information is static and thus provides only a partial view of cellular activity. More recent studies attempt to inte...

Textual Enhancement across Linguistic Structures: EFL Learners' Acquisition of English Forms

Studies of the benefits of textual input enhancement in the acquisition of linguistic forms have produced mixed results in the SLA literature. The present study investigates the effects of textual enhancement on adult foreign language intake of two English linguistic forms, subjunctive mood and inversion structures, to explore the role of the type of linguistic items in input enhancement studies. It also invest...

Shapes Matching and Indexing using Textual Descriptors

In this paper we propose a new method for matching and indexing shapes. Models of object silhouettes are stored in the database using their textual descriptors. As we will see, XLWDOS descriptors are sensitive to noise. We propose a “reduction technique” to process noisy shapes and match the corresponding XLWDOS descriptors using only “textual transformations”. The matching algorithm we propose is ...

Journal title:

Volume 28, Issue 

Pages  -

Publication date: 2012